Optional automatic image cleanup#171
Conversation
hiroTamada
left a comment
There was a problem hiding this comment.
this should be fine for the review, adding as a note for discussion
hiroTamada
left a comment
There was a problem hiding this comment.
looks good. the allow-list addition addresses the built-image concern nicely — deny-by-default with explicit opt-in per repository is the right approach. clean implementation overall: good separation of concerns, solid test coverage, proper race protection via MarkUsed, and conservative defaults (disabled + empty allow list + 30-day window).
…-auto-delete # Conflicts: # lib/instances/manager.go
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| matcher = strings.ReplaceAll(matcher, `\*`, ".*") | ||
| matcher = strings.ReplaceAll(matcher, `\?`, ".") | ||
| return regexp.MustCompile(matcher).MatchString(repository) | ||
| } |
There was a problem hiding this comment.
Regex recompiled on every pattern match invocation
Low Severity
allowPatternMatches calls regexp.MustCompile on every invocation. Since isAllowed calls this for each ready image against each allowed pattern, and sweep runs every minute, the total regex compilations per sweep is O(images × patterns). The allowed patterns are static after controller construction, so the compiled regexes could be computed once in NewController or normalizeAllowedPatterns and reused.
| } | ||
| if err := validateDuration("images.auto_delete.unused_for", c.Images.AutoDelete.UnusedFor); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
Missing positive duration validation for unused_for config
Low Severity
The images.auto_delete.unused_for field is validated only by validateDuration, which accepts any parseable Go duration including zero and negative values. A zero or negative value would cause images to be eligible for deletion on the very next sweep after being marked unused (~1 minute), rather than observing a meaningful retention window, contradicting the documented 720h default intent.


Summary
data_dir/images"*"as the global wildcard and an empty allow list deleting nothingConfig
Examples:
allowed: ["docker.io/library/*", "ghcr.io/kernel/*"]allowed: ["*"]Testing
Local:
make build-embeddedmake build-vz-shimgo test ./cmd/api/configgo test ./lib/imageretentiongo test ./cmd/apigo test ./lib/instances -run 'Test(CreateInstanceClearsRetentionStateBeforeMetadataSave|StorageOperations|Filter|Image)'go test ./cmd/api/config ./lib/imageretention ./cmd/api ./lib/instances -run 'Test(DefaultConfigIncludesMetricsSettings|LoadUsesDefaultImageAutoDeleteRetentionWindow|ValidateRejectsInvalidImageAutoDeleteUnusedFor|ValidateTrimsImageAutoDeleteAllowedPatterns|Sweep|AllowPatternMatches|MarkUsed|RunPerformsImmediateSweep|StartImageRetentionController|CreateInstanceClearsRetentionStateBeforeMetadataSave)'Deft (
deft-kernel-dev, fresh short-path workspace~/hmret):make ensure-ch-binaries ensure-firecracker-binaries ensure-caddy-binaries build-embeddedgo test ./lib/imageretention ./cmd/api/config ./cmd/apiTMPDIR=/mnt/data/home/sjmiller609/tmp go test ./lib/instances -run "TestCreateInstanceClearsRetentionStateBeforeMetadataSave" -count=1MarkUsedboth workNotes
images.auto_delete.allowedlist is a deliberate safety default: nothing is eligible for deletion until explicitly allow-listedNote
Medium Risk
Adds a background controller that can automatically delete cached image digests based on persisted retention state, which could cause unexpected data loss if misconfigured despite being disabled by default and gated by an allow-list.
Overview
Adds an optional server-wide image auto-delete feature for cached converted images under
data_dir/images, controlled via new configimages.auto_delete(enabled flag,unused_forretention window, and an explicit allow-list of repository patterns).Introduces a new
lib/imageretentioncontroller that persists per-digestunused_sincestate underdata_dir/system/image-retention, sweeps on startup and every minute, skips images referenced by instance metadata or snapshots, and deletes eligible digests (with metrics and logging). The API startup wires this controller behind the config flag, and instance creation now calls an optionalImageUsageRecorderhook to clear stale retention state before persisting new metadata.Written by Cursor Bugbot for commit 99a7b37. This will update automatically on new commits. Configure here.